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OBJECTIVES: The number of illicit drug users is prone to underestimation. This study aimed to use the cap- 
ture-recapture method as a statistical procedure for measuring the prevalence of intravenous drug users (IDUs) 
by estimating the number of unknown IDUs not registered by any of the registry centers. 

METHODS: This study was conducted in Hamadan City, the west of Iran, in 2012. Three incomplete data sour- 
ces of IDUs, with partial overlapping data, were assessed including: (a) Volunteer Counseling and Testing Cen- 
ters (VCTCs); (b) Drop in Centers (DICs); and (c) Outreach Teams (ORTs). A log-linear model was applied for 
the analysis of three-sample capture-recapture results. Two information criteria were used for model selection 
including Akaike's Information Criterion and the Bayesian Information Criterion . 

RESULTS: Out of 1,478 IDUs registered by three centers, 48% were identified by VCTCs, 32% by DICs, and 
20% by ORTs. After exclusion of duplicates, 1,369 IDUs remained. According to our findings, there were 9,964 
(95% CI, 6,088 to 17,636) IDUs not identified by any of the centers. Hence, the real number of IDUs is ex- 
pected to be 11,333. Based on these findings, the overall completeness of the three data sources was around 
12% (95% CI, 7% to 18%). 

CONCLUSION: There was a considerable number of IDUs not identified by any of the centers. Although the 
capture-recapture method is a useful and practical approach for estimating unknown populations, due to the 
assumptions and limitations of the method, the results must be interpreted with caution. 
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INTRODUCTION 

Illicit drug use is a major public health problem worldwide [1]. 
It is estimated that about 5% of the world's adult population, 
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or 230 million people, have used an illicit drug at least once in 
2010. The number of drug users worldwide is estimated to be 
about 27 million, which is 0.6% of the world's adult population 
[2]. About 15.9 million people worldwide may inject drugs, of 
whom about three million may be human immunodeficiency 
virus (HIV) positive [3]. Injecting drug use is the main route of 
transmission for about 10% of HIV infections in the world and 
30% of infections outside of sub Saharan Africa. Preventing 
HIV transmission through injecting drug use is one of the key 
challenges in reducing the burden of HIV worldwide [1]. 

It is estimated that between 1.2 and 2 million people are drug 
abusers in Iran of about 20% to 25% have injected at least once 
in their lifetime. Approximately 1% of the Iranian population 
use heroin, of which 40% inject it. Therefore, it is estimated that 
there are around 280 thousand injecting drug abusers in the 
country [4]. 
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Hamadan City, the center of Hamadan Province, stands on 
the north west of Iran and has a population of approximately 
600 thousand people [5]. According to the data recorded in the 
District Health Network database, 1,369 intravenous drug users 
(IDUs) were identified in the city in 2011, which seemed much 
less than that expected according to the national estimates. 

Illicit drug use is an illegal and stigmatized behavior and most 
drug users are reluctant to be identified as such. Hence, the num- 
ber of illicit drug users is prone to underestimation [3]. One me- 
thod which is commonly used to estimate hard-to-reach popu- 
lations is the so-called capture-recapture approach. This method 
was first used in ecology to estimate the unknown size of wild 
animal populations [6]. It was subsequently used as a potential- 
ly useful method in epidemiology for estimating the complete- 
ness of ascertainment of disease registers [7,8] and incomplete 
reporting [9,10]. Actually, the capture-recapture method can 
principally be applied to any situation where there are two or 
more incomplete lists. 

A high prevalence of HIV [4] and hepatitis C virus (HCV) 
[11] infections among IDUs emphasizes the importance of quan- 
tifying the dynamics of this population. Estimating the size of 
IDU populations is fundamental for predicting the future bur- 
den of HIV and HCV diseases and for planning and providing 
health care services for this group of individuals [12]. Further- 
more, statistically valid and reasonable estimates of IDU preva- 
lence can help policymakers to estimate probable outbreaks of 
infections associated with drug injection [13]. Populations with 
negative traits are usually underestimated. Several statistical 
methods have been suggested for the estimation of such un- 
known populations [14-16]. A straightforward and easy to use 
method is the capture-recapture approach. In this study, we used 
this method as a statistical procedure and a practical means for 
measuring IDU prevalence by estimating the number of un- 
known IDUs not registered by any of the three incomplete reg- 
istries. 



MATERIALS AND METHODS 

This study was approved by the Human Subject Review Board 
of Hamadan University of Medical Sciences and conducted in 
Hamadan City, the center of Hamadan Province in the west of 
Iran, in 2012. We extracted the data on IDUs registered in at 
least one the three centers that provide health services to IDUs 
in this city, from April 2011 to March 2012. IDUs were defined 
as people who inject narcotic substances into the body with a 
hollow needle and a syringe which is pierced through the skin 
into the body usually intravenously, but also intramuscularly or 
subcutaneously [17]. 

There are three centers for providing health care services to 



IDUs in Iran: (a) Volunteer Counseling andTesting Centers (VC- 
TCs); (b) Drop in Centers (DICs); and (c) Outreach Teams (ORTs). 
VCTCs provide consultant and educational services to IDUs in 
order to improve their knowledge of high-risk behaviors and 
harm reduction methods. In addition, these centers provide di- 
agnostic tests for IDUs and refer the suspected IDUs to special- 
ized medical centers for diagnostic and treatment services. DICs 
provide social support such as serving food, clothes, and bath- 
ing for IDUs. These centers also give agonist and maintenance 
medications. They also plan educational programs for IDUs in 
order to improve their knowledge of high-risk behaviors and 
healthy sexual behaviors. ORTs include a well-trained team of 
health workers who actively look for the IDUs not referred to 
VCTCs or DICs. They distribute safety boxes among IDUs in- 
cluding sterile syringes, disinfecting material and condoms. They 
also encourage IDUs to adopt safe and low-risk behaviors. The 
IDUs may be identified and registered by any of these centers. 
Some IDUs may be identified by more than one center. None- 
theless, none of these centers has a complete list of IDUs. We 
used the capture-recapture method in order to statistically esti- 
mate the approximate number of IDUs not identified by these 
registries. 

For this purpose, the list of IDUs recorded in these three data 
sources were extracted and compared with each other, in order 
to identify common names listed in more than one database. 
Since the IDUs' national identification codes were not recorded 
in the data sources, we used several demographic characteristics 
of the IDUs for comparison, including first name, second name, 
age, marital status, and region of residence. Then we arranged 
the data as shown in Figure 1. 

The simplest capture-recapture model is the so-called two- 
sample model [6] . Another capture-recapture method, which 



VCTCs 




ORCs 



Figure 1. Distribution of intravenous drug users registered by three 
centers: Volunteer Counseling and Testing Centers (VCTCs); Out- 
reach Teams (ORTs); and Drop in Centers (DICs). 
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was used in this study, is the so-called three-sample model, in 
which the three incomplete data sources of IDUs were used. In 
this approach, the capture-recapture method becomes more 
complicated and includes the following eight possible models 
as follows (Figure 1): 

1. Number of IDUs identified by VCTCs only (A) 

2. Number of IDUs identified by DICs only (B) 

3. Number of IDUs identified by ORTs only (C) 

4. Number of IDUs identified by A and B but not by C (AB) 

5. Number of IDUs identified byA and C but not by B (AC) 

6. Number of IDUs identified by B and C but not byA (BC) 

7. Number of IDUs identified by all three centers (ABC) 

8. Number of IDUs studies identified by none of the three 
centers (X) 

There are many elaborate statistical models available for the 
analysis of three-sample capture-recapture results. The log-lin- 
ear model, or Poisson regression, is a simple model which easily 
accommodates the three sources of data and is able to explore 
dependence between sources and adjust for dependence by in- 
cluding interaction terms in the model [18] . For this purpose, 
we prepared a dataset with four variables as follows: (a) "var_ 
A", with values 0 or 1, that described belonging to list A; (b) 
"var_B", with values 0 or 1, that described belonging to list B; 
(c) "var_C", with values 0 or 1, that described belonging to list 
C; and (d) "varjreq", which was a non-negative variable that 
described the frequency of observations in the combination of 
lists given by variables A, B and C. The unknown frequency of 
cases occurring in none of the lists was considered as missing. 
This missing value was estimated by the Poisson model. 

The interaction terms were used to model dependence. The 
basic assumption for the capture-recapture model is that there 
is no three-list interaction term for three lists. In other words, 
the third-order interaction vanishes (ABC = 0) [18]. By accom- 
modating the three sources of data as shown above, the log-lin- 
ear model can estimate the number of IDUs not identified by 
any of the three centers (X) and hence the total population of 
IDUs (N). 

There are two different information criteria which were pro- 
posed for model selection: Akaike's Information Criterion (AIC) 
and the Bayesian Information Criterion (BIC) [19]. The AIC is 
calculated as follows: 

AIC=G 2 -[2x(df)] 

Where G 2 is the likelihood ratio statistic associated with the 
fit of any model to the data and df denotes the degrees of free- 
dom of the model. The model giving the smallest value of AIC 
is the one selected. The second criterion, BIC, which is usually 
preferred to AIC in some applications, is calculated as follows: 

BIC=G 2 - [In (Nobs/2n)] x (df) 

Where G 2 and df are defined as above, and where In (Nobs) 
is natural logarithm of the number of parameters in the model. 



As above, the model giving the smallest value of BIC is the one 
selected. 

All analyses were performed at the 0.05 significance level us- 
ing statistical software Stata version 11.0 (StataCorp, College 
Station,TX, USA). 



RESULTS 

The general characteristics of the three data sources are shown 
in Table l.The majority of IDUs were single men aged 30 to 44 
years. Out of 1,478 IDUs identified by three centers, 710 (48%) 
were identified by VCTCs, 476 (32%) by DICs, and 292 (20%) 
by ORTs. In addition, 100 (7%) of IDUs were identified by at 
least two centers and 9 (1%) by all three centers (Figure 1). Af- 
ter exclusion of duplicates, 1,369 IDUs remained. 

The results of the log-linear analysis are shown in Table 2. The 
p-values show that there were significant differences between 
the saturated model (the 8th model) and all other reduced mo- 
dels. The sixth model (ABC AB BC) was the best fitting model 
with the smallest value of bothAIC and BIC. According to these 
findings, there might be 9,964 (95% confidence interval [CI], 
6,088, 17,636) IDUs who were not identified by any of the cen- 
ters. Hence, the real number of IDUs is expected to be 11,333 
(1,369 + 9,964). Accordingly, the combined completeness of the 
three data sources was almost 12% (95% CI, 7% to 18%). 
Among the three data sources, VCTCs were more complete than 

Table 1. Distribution of the demographic characteristics of the intra- 
venous drug users registered by three centers including Volunteer 
Counseling and Testing Centers (VCTCs); Outreach Teams (ORTs); 
and Drop in Centers (DICs) 



Characteristics 


VCTCs 


DICs 


ORTs 


Gender 








Male 


709 (99.86) 


367 (77.10) 


238(81.51) 


Female 


1 (0.14) 


109(22.90) 


54(18.49) 


Age (yr) 








<30 


242 (34.08) 


202 (42.44) 


211 (72.26) 


30-44 


380 (53.52) 


222 (46.64) 


70 (23.97) 


45-59 


85(11.97) 


44 (9.24) 


11 (3.77) 


>60 


3 (0.42) 


8(1.68) 


0 (0.00) 


Marital status 








Married 


246 (34.65) 


134(28.15) 


44(15.07) 


Single 


395 (55.63) 


254 (53.36) 


208 (71.23) 


Divorced 


68 (9.58) 


64(13.45) 


30(10.27) 


Widow 


1 (0.14) 


24 (5.04) 


10 (3.42) 


Region 








Urban 


701 (98.73) 


470 (98.74) 


288 (98.63) 


Rural 


9(1.27) 


6(1.26) 


4(1.37) 


Total 


710(100.00) 


476(100.00) 


292(100.00) 



Values are presented number (%). 
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Table 2. Log-linear models fitted to three lists of intravenous drug users registered by three centers and estimated number of intravenous 
drug users which was not registered by the centers 



Model 


K 


df 


Log likelihood 


G 2 


AIC 


BIC 


p-value 


X 


95% CI 


1.ABC 


3 


3 


-39.6971 


36.43 


30.43 


30.59 


0.001 


4,761 


3,852, 5,931 


2. A B C AB 


4 


2 


-38.5777 


34.19 


30.19 


30.30 


0.001 


4,078 


3,071, 5,491 


3. A B C AC 


4 


2 


-37.1667 


31.37 


27.37 


27.47 


0.001 


4,108 


3,229, 5,278 


4. A B C BC 


4 


2 


-25.2870 


7.61 


3.61 


3.71 


0.023 


6,768 


5,202, 8,948 


5. A B C AB AC 


5 


1 


-31.5662 


20.16 


18.16 


18.22 


0.001 


2,441 


1,717, 3,558 


6. A B C AB BC 


5 


1 


-23.5780 


4.19 


2.19 


2.24 


0.014 


9,964 


6,088, 17,636 


7. A B C AC BC 


5 


1 


-25.2076 


7.45 


5.45 


5.50 


0.006 


6,487 


4,682, 9,235 


8. A B C AB AC BC 


6 


0 


-21.4841 


0.00 


0.00 


0.00 


1.000 


24,298 


9,082, 62,706 



K, number of parameters; df, 
confidence interval. 



degree of freedom; G 2 , deviance; AIC, Akaike Information Criterion; BIC, Bayesian Information Criterion; X, unregistered; CI, 



Table 3. Comparison of the completeness of detecting intravenous 
drug users by the three centers including Volunteer Counseling and 
Testing Centers (VCTCs); Outreach Teams (ORTs); and Drop in Cen- 
ters (DICs) 



Data source 


n 


Completeness 


95% CI 


VCTCs 


710 


0.07 


0.04,0.12 


ORTs 


476 


0.05 


0.03, 0.08 


DICs 


292 


0.03 


0.02, 0.05 



the two other centers. 

Table 3 shows the completeness of detecting IDUs by the three 
centers. The completeness of the VCTCs was the highest and 
that of DICs was the lowest. However, the completeness values 
for all centers were very low. 



DISCUSSION 

Estimation of the size of IDU populations is essential for plan- 
ning and providing health care services to this group of individ- 
uals [12]. In addition, a valid and reliable estimate of the IDU 
population is needed for estimating the probable outbreaks of 
blood born infections associated with drug injection such as 
HIV and HCV [13]. Our findings revealed that the majority of 
IDUs were not identified by any of the centers. 

The p-values showed that there was a significant difference 
between the reduced models and the saturated one. We used 
AIC and BIC criteria in order to choose the best fitting model 
among the models, all of which were statistically significantly 
different to the saturated model. The sixth model with the small- 
est value of AIC or BIC was the one selected as the best fitting 
model. However, care must be taken when using AIC and BIC 
values for model selection. AIC and BIC do not provide a test 
of a model in the sense of testing a null hypothesis; i.e. AIC and 
BIC can tell nothing about how well a model fits the data in an 
absolute sense. If all candidate models fit the data poorly, these 
values will not give any warning of this [20]. 



One might try two-sample methods (AB, AC, and BC) and 
compare the results with the three-sample method (ABC). How- 
ever, it is important to remember that the capture-recapture me- 
thod requires a sufficiently high level of overlapping informa- 
tion to produce a reliable estimate of the unknown population 
[18]. Using the three-sample approach increases the overlap- 
ping area and thus produces more reliable estimates than that 
of the two-sample approach. Therefore, if possible, the three- 
sample approach is preferred. 

The completeness of the three lists was low and different. The 
proportion of overlapping cases was relatively low; however, 
sufficiently high overlapping information is required to produce 
a reliable estimate of the unknown population [18]. This might 
result in overestimation of the pooled estimates. This may ex- 
plain the low completeness of the data sources. Furthermore, 
the VCTCs, DICs and ORTs provide different services and IDUs 
may refer to these centers for different purposes. This might ex- 
plain the difference observed between the completeness of the 
three centers. 

Capture-recapture is a simple statistical approach and repre- 
sents an attractive method to assess the completeness of differ- 
ent data sources in order to estimate the size of the unknown 
population, which could not be identified from the data sourc- 
es. Although the capture-recapture method is a useful and prac- 
tical approach to estimate the hard-to-reach populations, due to 
the assumptions and limitations of the method, the results must 
be interpreted with caution. 

A number of studies used the three-sample capture-recapture 
method to estimate the population of IDUs in other countries. 
These studies revealed that registries underestimated the real 
size of the population of IDUs. Davies et al. [21] estimated the 
number of injecting drug users in the city of Edinburgh, Scot- 
land, in 1999, using the capture-recapture method. The number 
of IDUs was estimated to be 1,770 from the registry data, while 
the capture-recapture method estimated the number of IDUs 
to be 2,070. Accordingly, the completeness of the registry data 
was 86%. Hickman et al. [22] assessed IDU prevalence in Bris- 
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tol in 2004. They reported that the total number of IDUs re- 
corded by the three registries was 2,986, whereas the overall 
estimated number of IDUs using the capture-recapture method 
was 5,540. Based on these results, the overall completeness of 
the registries was 54%. Bohning et al. [23] and Goli et al. [16] 
used a different statistical method, the so-called mixture Pois- 
son model, to estimate the prevalence of heroin consumption 
in Bangkok and cigarette smokers in Hamadan in the west of 
Iran, respectively. 

Although the capture-recapture approach is a potentially use- 
ful method for estimating the size of an unknown population, 
this method, like any other statistical procedures, has its own 
limitations. A critical limitation of this method is that sufficient- 
ly high overlapping information is required to produce a reli- 
able estimate of the unknown population size. Otherwise, the 
likelihood functions may become flat and the resulting estimates 
based on log-linear models may possibly become unstable [18]. 
Another limitation of the capture-recapture method using log- 
linear models for investigating an unknown population is that a 
relative large number of data sources is required to hold the as- 
sumption of the normal distribution within log-linear models. 
Moreover, the validity of capture-recapture results depends on 
some assumptions. If these assumptions are not considered, the 
estimates may not be reliable. A critical assumption of capture- 
recapture methods is the independence of the data sources, so 
that either positively or negatively dependent sources may cause 
either underestimation or overestimation of the pooled estima- 
tes, respectively [6]. Of course, log-linear models are able to han- 
dle dependence among data sources and adjust for this depen- 
dence by including interaction terms in the model [18]. 

Despite its limitations, the current study indicated that the es- 
timated population of IDUs based on registries, and hence the 
prevalence estimates of IDUs, are considerably underestimated 
relative to the actual values. Therefore, the policymakers who 
plan and provide health care services for IDUs should take this 
issue into special consideration. 

The results of this study indicated that the capture-recapture 
method is an easy to conduct and straightforward approach for 
the estimation of the size of the hard-to-reach population of 
IDUs. According to these findings, the majority of the IDUs 
were not identified by the registry centers. Although the cap- 
ture-recapture method is a useful and practical approach to es- 
timate unknown populations, due to the assumptions and limi- 
tations of the method, the results must be interpreted with cau- 
tion. 
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